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Abstract 

We further develop the Multivariate Decomposition Method (MDM) for the Lebesgue integration 
of functions of infinitely many variables x±, X2, £3, ■ ■ • with respect to a corresponding product of a 
one dimensional probability measure. The method is designed for functions that admit a dominantly 
convergent decomposition / = ]T) u / u , where u runs over all finite subsets of positive integers, and 
for each u = {i \,..., ik} the function f u depends only on x^,, Xi k . 

Although a number of concepts of infinite-dimensional integrals have been used in the liter¬ 
ature, questions of uniqueness and compatibility have mostly not been studied. We show that, 
under appropriate convergence conditions, the Lebesgue integral equals the ‘anchored’ integral, 
independently of the anchor. 

For approximating the integral, the MDM assumes that point values of / u are available for 
important subsets u, at some known cost. In this paper we introduce a new setting, in which it is 
assumed that each / u belongs to a normed space F u , and that bounds B u on ||/ u ||f u are known. 
This contrasts with the assumption in many papers that weights q u , appearing in the norm of the 
infinite-dimensional function space, are somehow known. Often such weights y u were determined 
by minimizing an error bound depending on the B u , the y u and the chosen algorithm, resulting in 
weights that depend on the algorithm. In contrast, in this paper only the bounds B u are assumed 
known. We give two examples in which we specialize the MDM: in the first case F u is the |u|-fold 
tensor product of an anchored reproducing kernel Hilbert space, and in the second case it is a 
particular non-Hilbert space for integration over an unbounded domain. 


1 Introduction 

This paper is intended as a contribution, both theoretical and practical, to the challenging task of 
numerical integration of multivariate functions, when the number of variables is large, and even infinite. 

High-dimensional integration has emerged in recent years as a significant new direction for nu¬ 
merical computation, see [11]. On the one hand the improved processing power of computers has 
encouraged the practical computation of multivariate integrals with very large numbers of variables, 
into the hundreds or thousands or tens of thousands. On the other hand such problems will never be¬ 
come trivial - indeed, many important high-dimensional problems (see below for an example) contain 
parameters for which physically interesting choices can lead to problems of unlimited difficulty. 

Many papers have been written in recent decades about high-dimensional integration, see the 
reviews [3] for sparse grid methods and [10] for Quasi-Monte Carlo methods, and their many references. 

In many applications the number of variables is not merely large but in principle infinite. This 
is the case in the important class of problems of elliptic partial differential equations with uncertain 
coefficients. A key example is Darcy flow through a porous medium with highly variable permeability 
(see e.g., [18]), with the permeability modelled as a random field. Since a continuous random field 
requires an infinite number of scalar random variables for its description, the expected value of any 
property of the flow is in principle an infinite-dimensional integral. Of course in practice the infinite¬ 
dimensional integral has to be truncated to a finite-dimensional integral, but if the correlation length 
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of the permeability field is small, or the variance is large, then the dimensionality might need to be 
very large indeed to capture the essential physics. 

Some other recent papers devoted to infinite-dimensional integration are [1,2,5-7,13-16,21,22,28, 
30,31,34-36,41-45,47-49]. 

The theoretical setting in most of these papers (and the theoretical setting is of key importance, 
given the general lack of useful intuition in high dimensions) has been that of “weighted” reproducing 
kernel Hilbert spaces, where the weights enter the norm of the function space. Often the spaces 
are tensor products of one-dimensional reproducing kernel Hilbert spaces, and the “weights” are of 
the “product” kind introduced by Sloan and and Wozniakowski [38], in which there is one weight 7 j 
for each variable Xj, with 71 > 72 > • • • > 0, with the decreasing weights reflecting the decreasing 
importance of the successive variables. This has been extended to “general weights”, in which case 
there is a potentially different non-negative weight y u for each finite subset u of the natural numbers; 
the product weight case is then recovered by setting y u = riyeuTr I n th ese papers it is assumed that 
the integrand is expressible in the form 

/(*) = /«(*) = /«(*«)> (!) 
|u|<oo |u|<oo 

where the sum (see also (3) below) is over all finite subsets u C N := {1, 2,...}, with |u| denoting the 
cardinality of u, and each function f u depends only on the subset of the variables in u and we write 
x u := {Xj : j G u}. Furthermore, f u is assumed to belong to a normed space F u , which is usually a 
reproducing kernel Hilbert space. The weights determine the importance of different subsets u through 
their appearance in a norm of the form 



in which case the function / belongs to a Banach space T = Commonly q = 2, in which case T 

1/2 

is a Hilbert space, and some papers replace y u by ■ 

Throughout this paper we shall use the convention that in infinite sums over subsets as in (1) 
and (2) the terms are to be ordered in terms of increasing truncation dimension, 

J2 /“(*) := i im J2 /«(*), ( 3 ) 

z ' d—>oo z ' 

|u|<oo uC{l,...,d} 

with the additional subsets when d increases to d + 1 ordered as for the case d with respect to the 
original members. 

The Multivariate Decomposition Method (MDM) proposed in [35,43,44] is a generalisation of the 
Changing Dimension Algorithm in [28,34], Here too it is assumed that an expansion of the form (1) 
exists, and that values of / u (x„), while not available explicitly, can be obtained by a modest number of 
evaluations of f(x). In Section 5 we give specific examples in which values of f u (x u ) can be obtained 
from at most 2l u l evaluations of f(x) - an acceptably small number if the cardinality |u| is small. 

In previous MDM papers [35,43,44] the norm was assumed to be of the form (2). In the present 
paper, in contrast, we do not assume that weights y u are given as a priori information. Rather, we 
assume that the terms /„ in the expansion (1) belong to normed function spaces F u , and crucially, 
that upper bounds on the norms |/ u ||f u are known, i.e., that for |u| < 00 numbers B u satisfying 

ll/ulk < B u (4) 

are given as a priori information. This new setting is equivalent to putting q = 00 and y u = B u in (2), 
but in all other cases the two approaches are not equivalent. Given the bounds B u , one is of course 
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free to select a value of q and a set of weights 7 U that make ( 2 ) finite, but in neither case is the choice 
unique, or is one choice obviously better than another. In the present work we make no explicit use 
of weights. 

The advantage of the current setting, in which bounds B u rather than weights q u are specified, lies 
in its immediate applicability once such bounds are known. We note that a number of recent papers 
have provided directly useable bounds B u : for partial differential equations with random coefficients, 
see [4] for the case of uniformly distributed stochastic variables, and [17,20] for the lognormal case; and 
for generalized response models in statistics, see [37]. It seems likely that similar bounds will be found 
for other applications in the future. In contrast, in practical situations it is typically not clear how to 
choose weights q u , or product weights 7 j. We recall that weights were originally introduced (in [38]) to 
provide a setting in which the tractability of multivariate integration (roughly, to know what happens 
to the worst-case error as d —> 00 ) could be studied. Never was it claimed that weights were naturally 
available in an application. Some recent papers have obtained formulas for suitable weights, but these 
“optimal weights” are deduced by minimising an error bound which depends on the B u , the y u and 
the chosen algorithm. See [26, 27] and [17, 24] for the PDE with random coefficients and randomly 
shifted lattice rules in the uniform and lognormal cases respectively, and [ 8 , 9] for the uniform case 
with higher order digital nets, as well as [23] for a survey of these results. The dependence of weights 
on the algorithm is unacceptable from the point of view of information based complexity, where the 
complexity of the problem (i.e., integration in specified by ( 2 ), and thus depending on the 7 U ) 
is supposed to be studied independently of any algorithm. In contrast, we believe that the present 
setting will provide a robust basis not only for the development of computational schemes, but also 
for future complexity and tractability studies of high-dimensional integration. 

The problem to be considered is that of integration over an infinite-dimensional product region, 

l{f) := [ f(x) dp(x), (5) 

Jd n 

where the integral is in the Lebesgue sense. Here p is the countable product, p = xJfiPi, of a one¬ 
dimensional probability measure pi defined on a Borel set D CM, and D ]: is the set of all infinite 
sequences x = (aq, X 2 , £3, • • •) with Xj € D. We assume that p\ is determined by a probability density 
p on D. 

At this point it is worth mentioning that many papers define the infinite-dimensional integral as 


l a (f) ■= lim / f(xi,...,x d ,a,a,...) dp d (xi,... ,x d ), ( 6 ) 

d— »00 J £)d 

where p d = x^ = 1 /xi, for some fixed aeD, usually with the ‘anchor’ taken as a = 0. Taken on its own 
that definition seems open to question if the value of the integral could depend on the choice of the 
anchor a. We address this uniqueness concern in Section 2, where sufficient conditions are given to 
ensure that the equality 1(f) = l a (f ) holds independently of the choice of a. 

To approximate the integral (5), the MDM uses the decomposition of / given in (1). Indeed, 
assuming that the partial sums in (3) converge dominantly to /, we have 


1(f) = lim Y [ /u(*u) d/X| u |(* u ) =: lim Y 7 u(/u) = X] 7 u(/u)- ( 7 ) 

d—» 00 z ' n lui 1 1 d— »oo zz ' 

uC{l,...,d} uC{l,...,d} |u| <00 

The essence of MDM is that a separate quadrature rule (which in all but a finite number of cases will 
be the zero approximation) is applied to each term f u in the decomposition of /. In more detail, the 
overall algorithm A e for approximating 1(f) up to an error request e has the form 


n u 

•Ae(f) ■= Y A ^riu(fu) ■= Y ^2 w u,iM x u,i), 

ueU(e) uSW(e) *=1 


( 8 ) 
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where the active set U(e) is a finite set of finite subsets of N, and n u . x Ut i and w u ^ are parameters 
of the quadrature rule Ai,n u - In effect, the contributions to 1(f) from terms f u with u outside the 
active set are approximated by zero and thus the construction of U(e) depends on the error request e. 
We note that both the active set U(e) and the algorithm A e are intended to be independent of the 
particular function / £ T once the bounds B u are provided. 

Clearly, the selection of the active set U (e) and the determination of the quadrature rules A UtUu for 
u £ IA(e) are key ingredients of the MDM. To make the selections in a rational way we need to assume 
not only that we have a priori information about the size of the terms f u in the expansion of / in 
the form of the upper bounds B u , but also that we are provided with suitable information about the 
difficulty of the integration problem and the quality of the quadrature rules in F u . 

For two specific applications, we develop an MDM whose worst-case error is upper bounded by 
£ i-S(e) ail( ] t]. ie information cost is proportional to (l/e) 1-1-5 ^ where h(e) > 0 and 5(e) —> 0, under 
quite general assumptions about the bounds B u and the cost of function evaluations. This means 
that for the given application the MDM is almost optimal since even for the corresponding space of 
univariate functions, the minimal cost of computing an e-approxinration is proportional to 1/e for 
these two applications. 

The content of the paper is as follows. In Section 2 we discuss relations between the Lebesgue 
and ‘anchored’ integrals. Then in Section 3 we describe the setting for the MDM and in Section 4 
we develop the MDM in its general form. Then in Section 5 we turn to an important application, 
one that provides the initial motivation for the method. This is the case of the so-called “anchored 
decomposition” associated with anchored reproducing kernel Hilbert spaces, which have very often 
been used in studies of multivariate integration and approximation. We shall see that in this case all 
the assumptions of the MDM are satisfied, and that there is a significant class of integration problems 
for which the MDM can be highly efficient. In Section 6 we consider another application, this one 
of a non-Hilbert space nature. Efficient implementation of MDM will be considered in a forthcoming 
paper [12], 

2 Lebesgue integral and ‘anchored’ integral 

In this section, we compare the Lebesgue integral (5) with the ‘anchored’ integral (6) and show in 
particular that, under suitable assumptions, they are equivalent. 

Recall that D is a Borel subset of R and D N , where N = {1,2,3,...}, is the set of all infinite 
sequences/points x = (xi,X 2 ,x%, ...) with each Xj £ D. Furthermore, p\ is a probability measure on 
the Borel cr-field of D , and we denote by pd = Xj=iPi the d-product of p\ on D d , and by p = x^ =l p\ 
the countable product of p\ on D ]1 . By Lebesgue integral of a function / : D u — y R we mean the 
integral with respect to p, and a.e. means almost everywhere with respect to p. 

In general, the Lebesgue and ‘anchored’ integrals are quite different. Moreover, ‘anchored’ integrals 
may depend on a. A simple example is provided by the function / : R N —>• R such that f(x) = 0 if x 
has only finitely many non-zero coordinates, and f(x) = 1 otherwise. Indeed, then the limit in (6) is 
0 for a = 0, and 1 for a = 1. On the other hand, for the integral (5) we have 1(f) = 1. 

The following well-known Lebesgue’s dominated convergence theorem , see, e.g., [19, Sect. 26], will 
play an important role in our considerations. 

Theorem 1 Let {fd}d >o be a sequence of integrable functions that converges dominantly to f, i.e., 

(i) lim^oo f d (x) = f(x) for x a.e. 

(ii) for some integrable function g we have \fd(x)\ < \g(x)\ for x a.e. 

Then 

1(f) = lim l(f d ). (9) 

a—>oo 
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Theorem 1 can be directly applied to a variety of sequences {f d }d> o- It implies, in particular, that 
if the functions f u in the decomposition ( 1 ) are integrable and 

fd ■= E /u (io) 

uC{l,...,d} 

converge dominantly to /, then the integral can be computed using the equality 

1(f) = lirn E [ /u(*u) d//| u |(* u ). 

d ^°° rW JD\»\ 

(We will use this fact later in the MDM.) Similarly, for fixed a = (oq, 02 , < 23 ,...) € D N we have 

1(f) = l a (f) := lirn / /(xi,...,x rf ,a rf+ i,a d+ 2 ,...)d/id(xi,...,x rf ) ( 11 ) 

a —>-oo J jjd 

if the functions 

D n 3 x = (xi,x 2 ,x 3 ,...) H- f(xi,...,x d ,a d+ i,a d+2 ,...) (12) 

are integrable and converge dominantly to /. That is, we then have equivalence of the Lebesgue and 
‘anchored’ integrals for the anchor a. 

The following result allows us to claim that such equivalence holds for all anchors after checking 
only one sequence, e.g., f d (x) = f(xi,..^x d ,a,a,a,...) for an a £ D. The sufficient condition is 
however stronger than that in Theorem 1. 

Theorem 2 Let {fd } d >0 be a sequence of integrable functions converging dominantly to f, such that 
each fd depends only on the variables x\,... ,x d . Let, in addition, the convergence be uniform a.e., or 
the following less restrictive assumption be satisfied: for x = (xi,X 2 ,x 3 ,...) a.e. 

\fd(x) - f(x)\ < g d (xi,... ,x d ) Vd> 0, (13) 

for some sequence {g d }d >0 of integrable functions that converges dominantly to the zero function. Then 
( 11 ) holds for a = (ai, a 2 , 03 ,...) a.e. 

Moreover, if (13) holds for all x e D ] then (11) holds for all anchors a e D u provided the 
functions ( 12 ) are measurable. 

Proof. By dominated convergence of {f d }d> 0 we know that / is integrable. 

Let V 1 be the set of all points a = ( 01 , 02 , 03 ,...) for which the functions (12) are measurable 
for all d > 0. We show that /r(T>i) = 1. Indeed, since / is measurable and // is a product measure, 
p = fddX P, it follows that for any fixed d the measure of (a d +i,a d+ 2 , ■ ■ •) such that the functions ( 12 ) 
are measurable is 1. Hence p(T> 1 ) = 1 as an intersection of countably many sets of measure 1. 

Let V 2 consist of all a for which 

p[[x : \f d (x) - f(xi,...,x d ,a d+ i,a d + 2 ,---)\ < 9 d(xi, ■ ■ ■ ,x d ) Vd > 0}) =1. (14) 

Then p(T> 2 ) = 1 as well. Indeed, denote by V the collection of all x for which (13) holds. Since 
p(T>) = 1, using again the argument that p is a product measure, we obtain, for a a.e. and fixed 
d > 0 , that 

h{[ x ■ (xi,---,x d , a d+1 ,a d+ 2 ,---) G ^}) = hd({(x i,...,x d ) : (x l: ... ,x d ,a d+l ,a d+ 2 , ■ ■ ■) G T>)^j = 1 . 

Since V 2 is the countable intersection of these sets, it follows that p(T> 2 ) = 1. Concluding this part of 
the proof we have that p(V\ n V 2 ) = 1. 
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Now let a G T>\ n T> 2 - Then for x a.e. 

I/(*) - f(x i,... ,x d ,a d+ i,a d+2 , ■ - .)l < I fd(x) ~ f(x)\ + | f d (x) - f(x i,.. •, x d , a d+1 , a d+2 ,.. .)| 

< 2g d (xi,...,x d ), 

and by dominated convergence of {g d }d >o to the zero function, it follows that 
lim / \f(x) - f(xi,...,x d ,a d+ i,a d+2 ,...)\dn(x) < 2 lim / g d (xi,..., x d ) dfj, d (xi ,..., x d ) = 0, 

d—tOO J£)N d— »0O J£)d 

which implies (11) for a a.e. 

To show the remaining part of the theorem, observe that the actual set of anchors a for which 
we have equvalence of the Lebesgue and ‘anchored’ integrals includes T>\ C\T> 2 . Under the additional 
assumptions we obviously have T>\ = D ]1 and T >2 = D ] . Hence T>\ n T >2 = as claimed. □ 

Example 3 For D = [—a, a] or D = M, consider 


f(xi,x 2 ,...) := (15) 

3 =1 

where X 3 > 0 Vj and < °°; an d assume that E := f D x 2 d/ui < 00 . The function f is well 

defined since even in case D = M (where fi\ could be Gaussian) the set of sequences (xi,X 2 , ■ ■ ■) G M N 
for which the sum (15) is finite is of measure one. The Lebesgue integral equals 


Af) = E 


3 =1 



XjX'j d/Ji 


^E A 

i=i 


j- 


On the other hand, for the ‘anchored’ integral with a = (ai, 02 , 03 ,...) we have 



• • j x d , a d - )-i, Ud+ 2 ) • • •) ..., Xrf) 


d 00 

E ^ X 3 + X] A J a i 

l=i l=rf+i 


which converges to T(f) if and only if 

OO 

^ Aja^ < 00 . (16) 

l=i 

The inequality (16) is a necessary and sufficient condition for 1(f) = l a (f). It holds for all a if 
D = [—a, a], and for a a.e. if D = M. We also observe that assumptions of Theorem 2 are satisfied 
with, e.g., 


d 00 

fd(x) = f(xi,...,x d , 0,0,0,...) = jX 2 and g d = a 2 ^ A j 

1=1 j=d+1 

only if D is a finite interval, and then the convergence is uniform on the whole domain D ] . 


3 The setting for MDM 

We provide in this section basic assumptions concerning the approximate integration problem consid¬ 
ered in this paper. In particular, we will introduce standing assumptions (A1)-(A6) that pertain to 
the whole paper. As we shall see later, the assumptions are satisfied by a number of specific problems. 
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3.1 The function class J 7 


We introduce a class F of oo-variate real-valued functions whose integrals are to be approximated. 
For a finite subset u C N and a point x G D : \ x u denotes the variables Xj with j G u, and D u denotes 
the product integration region with the variables replaced by those in x u , where |u| denotes the 
cardinality of u. 

As in the Introduction, functions in the class T are expressed as sums of functions / u , with /„ 
depending only on the variables in x u and belonging to a normed linear space F u . 

(Al) Each / G T has a decomposition of the form (1) where the sums over all finite subsets uCN are 
defined as in (3), and each f u is formally a function on D li but depends only on the variables Xj 
with j G u. The functions fd defined in (10) are assumed to be dominantly (or even uniformly) 
convergent to /. 

(A2) Each component f u of / in (1) belongs to a normed space F u of real-valued measurable functions 
defined on D u with norm ||/ u ||f u - Moreover, ||/ u ||_f u < B u , see (4), for known positive numbers B u . 
In particular, Fq is the space of constant functions with norm given by the absolute value. Finally, 
point evaluation at any x u G D u is assumed to be a continuous linear functional on F u . 

We shall see later in Sections 5 and 6 concrete examples of the function class T. 


3.2 The integration problem 

As in (7), we express an infinite-dimensional integral 1(f) for / G F as a sum of multivariate integrals 
Iu(/u) from the decomposition / = 5^| u | <0O fu- Recalling that p is a given probability density function 
on D , we write p u (x u ) := fljeu P( x j)- Fo r u = 0, we set I%(f%) := /@. We make the following 
assumption. 

(A3) All functions in F u are Lebesgue measurable and integrable with respect to p u (x u ) d;r u , and the 
functionals I u are continuous, i.e., 


Cu ■= ||/u| 


sup 

ll/ulk„<i 



fu(x u ) Pu(x u ) dx u 


< oo. 


(17) 


At this moment, we also assume that the numbers B u in (4) and C u in (17) satisfy 

E C u B u < oo, (18) 

|u|<oo 

which will later be replaced by a stronger assumption (A4). Since |/ u (/ u )| < \\I U \\ ||/u||f u < C Y U B u , the 
condition (18) implies that the sum E u | <00 |^u(/u)| is finite. Moreover, 

sup|X(/)| < V C U B U < oo. 

feT | u| <oo 

We end this subsection with the following remark. 

Remark 4 Instead of assuming convergence in (Al), we can impose conditions on the point evalu¬ 
ation functionals as follows. Let L UyXu be the point evaluation functional on F u , L UiXu (f u ) = f u (x u ). 
Assume that for every u, we have C u := sup Xu ||L U) *ull < 00 an d 

E B u < oo. (19) 

|u|<oo 
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Then \^2^ <ao f u (x u )\ < X^| u | <0O l-^u,* u (/u)| < X^| u |<oo Ai||/u||.f u < S| u |<oo 
uniform convergence on the whole domain D n . Hence, defining the class 


C U B U < oo; that is, we have 


T* := 




we have by Theorem 2 that for any f in T* the Legesgue integral equals ‘anchored ’ integral for any 
anchor, and the integral can be expressed by any decomposition. 

For spaces F u being the |u| -fold tensor products of some space Fni of univariate functions, we 

often have that C u = and C u = Then for many families of bounds B u , including bounds of 

the form (37) to be discussed later, it is known that (see Lemma 10 below ) (18) is equivalent to (19) 
which in turn is equivalent to 

B u < oo. 

IUI < oo 


3.3 Examples of decompositions 

In practice one can expect to be given the oo-variate function /, the integration domain D ] and the 
weight function p , after which it is the user’s task to define a suitable sequence of normed spaces F u 
and a method of decomposing / into components / u £ F u . 


Example 5 The following oo-variate problem is a variant of a simpler model problem introduced 
in [25, Section 1.5], which in turn is modeled on a study [26] of a diffusion problem for the flow of a 
liquid through a porous medium treated as a random permeability field. In this example we have 


D ■= p{x) ■= 1 , f{x) 


1 


! + ££ iVi 2 ' 


( 20 ) 


For the space F u we here choose for simplicity a Hilbert space. Given the requirement that point 
evaluation be a continuous linear functional, we choose the simplest Hilbert space available to us, 
namely 

F u := H 1 . 1 := (g ) H\ 

feu 

with the conventional Sobolev-type norm 


Iblk 


( 22 \\ Va 3\\ 2 L 2p u ) N ] 

\a<(l,..,l) / 


where the sum is over all multi-indices a. with components aj £ { 0 , 1 } for j £ u, and V a denotes the 
appropriate weak mixed derivative. 

This choice of the normed spaces F u allows many different decompositions. To illustrate this point, 
let us first confine our attention to so-called “anchored” decompositions (see e.g., [29] and later), with 
anchor at some fixed a £ D. That is, for f £ F the terms f u £ F u in the decomposition (1) are defined 
by the property that 

fu(x u ) = 0 if Xj = a and j £ u. (21) 

For each fixed value of a £ D, the decomposition (1) satisfying this property is uniquely determined. 
(For a given finite subset D C N, set Xj = a for all j ^ v in (1). The only surviving terms on the 
right-hand side are those f u for which uCp, Working from the smallest subsets upwards, one proves 
inductively that each f u is uniquely determined.) For this anchored decomposition we have 


fd(x i, ...,x d ) 


22 /u(*u) = f(xi,...,x d ,a,a,...), 

uC{l,...,d} 
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which follows on setting xj = a for j > d in the decomposition ( 1 ) and using the property (21). 

Since there are an infinite number of choices for a, there are correspondingly an infinite number 
of decompositions, which are easily seen to be different. Under the conditions of Theorem 2 we know 
that each anchored decomposition will give the same value for the exact integral 1(f), no matter the 
choice of the anchor a. However, the MDM developed below will in general give different approximate 
results for different anchors. 

This choice of spaces F u also allows the so-called “ANOVA” decomposition (see e.g., [29]) and 
many other possibilities. For example, a decomposition could be determined as the result of some 
numerical computation. 

Example 6 With the same definition of F u as in Example 5, we can express the zero function by the 
zero decomposition, where the uniform convergence holds true, and we obviously have Z(0) = 0. We 
now present an example of a decomposition of the zero function which at face value gives a non-zero 
value for the integral. 

For k > 1, let ifk(t) = 2 fc+1 (l — |2 k+1 t.— l|) + with t E D = [—|, ^], i.e., ifk(t) is the ‘hat' function 
supported on [0, 2 -fe ] and satisfying f D fik(t) df = 1. For x = (xi,X 2 , £ 3 ,...) E D N , we choose 

f{i}(x) = ifi(xi), and /{!,...,£}(*) = V’fcfci) - V’fc-i(^i) for k> 2 , 

and set f u = 0 for all other finite subsets u. We obviously have that /n E Em k \ and the sum 

d 

fd(x i,...,x d ) := Y /u(») = Y /{!,-,&}(*) = Mxi) 

uC{l,...,(i} fc=l 

is pointwise, but not dominantly, convergent to zero for all x £ D fl . However, 

d 

Y J u(/u) = Y J {i.(/{1.fcj) = 1 vd > 1. 

uC{l,...,d} k=l 

This example shows that the dominated convergence of the decomposition in (10) is crucial. 

As a final comment, we remark that it is often more convenient in practice to choose the spaces 
F u to be reproducing kernel spaces based on a simple univariate kernel, because the norms of a given 
function are typically smaller. In this case the decomposition in (1) is uniquely determined. 

3.4 A strengthened assumption on C u and B u 

For our formal setting we make a stronger assumption than (18), namely we assume a certain decay 
(A4) a 0 := decay({C u B u } u ) := sup j a : Y ( C u B u ) 1/a < 00 j > 1. 

^ |u| <00 J 

The purpose of this strengthened assumption will become clear in Subsection 4.2, when we construct 
the active set. 

3.5 Allowed algorithms 

In general, the components f u in the decomposition / = X^|u|<oo /u are n °t known explicitly. Never¬ 
theless, it is assumed that we can sample f u at arbitrary points x u in the domain. Explicitly, we make 
the following assumption. 
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(A5) For a finite set ucNwe can evaluate / u (cc u ) for x u G F>l u l at cost -£(|u|), where £ is a given 
non-decreasing function. 

At this point we make no assumption about our ability to evaluate f(x), but we shall return to this 
question in Section 5. 

We assume that for each u we have at our disposal a sequence {Ai,n} ngNU { 0 } °f quadrature rules 
approximating 7 U (/ U ) as in ( 8 ), with A U) o = 0 , and, moreover, the following condition is satisfied: 

(A6) There exists q > 0 with the following property: for each u, there exist > 0 such that the 
worst case error of A u ^ n in the unit ball of F u satifies 

\\I u -A Utn \\= sup |I u (/u) - A u ,4/ u )| < gu ’^ 0 for n = 0,1,2,.... ( 22 ) 

Note that q is not uniquely defined. Since ( 22 ) holds even for n = 0, we can assume C u < G UtQ where 
C u is as in (17). We also observe that ( 22 ) implies lim )WOO ||/ u — A u>n || = 0. 

4 Multivariate Decomposition Method 

We are now ready to introduce the Multivariate Decomposition Method (MDM) in our setting. 

4.1 MDM 

As in [35,43,44], the first step of the method is to construct, for given e > 0, what we call here the 
active set U(e) - a finite collection of those subsets u C N that are most important for the integration 
problem. Specifically, under our standing assumptions (A1)-(A6), we choose a set U(e) such that 

E IW«)I < (23) 

u £U(e) 

The MDM A e (f) for the integral 1(f) is then given by ( 8 ), with the values of n u chosen such that 

|7u(/u) — A Ui n u (/ u )| < -. (24) 

uGU(e) 

It then follows that for all / gT the integration error of MDM satisfies 

\l(f)-Mf)\ < J2 l 7 u(/u)|+ E \Wn) - A u , nu (fu)\ < | + | = £• 
vtfU (e) u£U (e) 

Consequently, the worst case error of A e in T satisfies 

e(A £ \T) := sup \l(f) - A e (f)\ < e. 

The information cost is 

cost(A e ) : = E "U X(|u|). (25) 

u£U(e) 

We remark that such an active set U(e) is not unique. Note the two distinct special cases U(s) = 0 
(which corresponds to A £ = 0 ) and U(e) = {0} (which corresponds to A £ (f) = /@). 
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4.2 Constructing U(e) 

We use essentially the same approach for constructing the active set life) as in [43]. Recall that for 
each u we have |/ u (/ tt )| < C U B U , and that the sequence {C U R U } satisfies (A4). It follows that for any 
a £ (l,«o) we may define 


U{e) = U(e, a) 


(C U B U 




> 


e /2 


Z]|d|<oo(Co B v ) l /° 


(26) 


and this would yield (23). Moreover, following the proof of [43, Theorem 2] we can obtain an upper 
bound on the size of the resulting active set, as given in the proposition below. 


Proposition 7 LetU(e,a) be given by (26). Then for any e > 0 and a £ (l,«o) we have 


\U{e,a)\ < 


i 

a—1 



a 

a—1 


4.3 Constructing A l£n JJ' u ) 

The main difficulty in the construction of the algorithms A Ujnu for u £ U{e,a) is the selection of the 
numbers n u . A natural approach is to minimize the information cost (25) subject to the desired error 
bound (24) being attained. This depends on the rate of convergence of the worst case errors ||/ u — A, in || 
for fixed u and n —)• oo. For a given selection of the n u this rate is determined by (22), from which it 
follows that for any / £ T we have 

E i«/)-a,„„(/)i < E /rviT < 27) 

u £U(e,a) u&U(e,a) U 

Observe that if we take 

n u = n u (s, q) = [h u J , 

where the positive real numbers h u minimize X) u eW(e a ) hu T(|u|) subject to YlueUfe a) ^u,q B u /h^ = e/ 2 , 
then both the error and the information cost are controlled, since 

X] ^ X | and X n ^du|) < X h u <£"(|u|). 

u eW(s,a) U u &A(e,a) U u &A{e,a) u £U(s,a) 


An explicit formula for such h u can be obtained using a Lagrange multiplier argument, giving 



l/(9+l) 


(28) 


This analysis leads to the following theorem. 


Theorem 8 Under the standing assumptions (A1)-(A6), for any e > 0 and a £ (l,ao) the algo¬ 
rithm A £ with U{e) = U (e, a ) defined by (26) and n u = [/i u J with h u defined by (28) produces an 
approximation to the integral T with worst case error e(A £ ; F) < e and 


cost(M e ) < X /i u T(|u|) < 

u(HU(e,a) 


< 



, 1 + 1 /? 

X £W /{q+l) {Gu,qB u f/^ 

u £U(e,a) 

\ 1 + 1/9 

X ( Gu,q B u) 1/iq+1) ) max T(|u|). 

uefea) J ueW(£ ’ o) 


(29) 
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If we wish, we could also assume, analogously to Assumption (A4), that 

a q := decay({Gu !<? -B u } u ) := sup <j r : Y (G U)q B u ) 1/r < oo\ > 1. 

^ | u | <oo J 

In that case, if it also happens that q < a q — 1, then the sum in (29) would be uniformly bounded for 

all e > 0, 

Y ( G u, g £u) 1/(9+1) < X] (Gu^Bu) 1 /^ 1 ) < OO. 
uGW( £ > a ) |u|<oo 

Then, up to a constant, the cost in Theorem 8 would be upper bounded by £ _1 / ? max ueW (^ Q ) i?(|u|). 

Theorem 8 differs from [43, Theorem 9] in the following way: the assumed bound on the errors for 
the corresponding algorithms A U . T , U in [43, Formula (18)] involve also some logarithmic factors in n u 
which results in an overall error that is bounded by £ times a small factor depending on e. Later in 
Subsection 5.6 we will also encounter such a scenario, and we will relax the requirement (24) a little 
by neglecting those multiplicative factors when choosing n u . In this way we obtain an algorithm A e 
whose error is slightly larger than e. 

Remark 9 For a practical implementation we note that the construction of the active set U{e) = 
U (e, a) in (26) depends on the error request £ and the choice of the decomposition through the spaces 
F u and hence on values of C u and B u , and additionally on the choice of the parameter a. Moreover, 
the infinite sum from the denominator in (26) needs to be estimated from above. The values of q and 
G U)q in (22) depend on the available algorithms M U;nu and enter the formula for n u via (28). 

5 First application: anchored RKHS 

In both this and the next sections, the space F u is an anchored space, anchored at 0. This means that 
for u / 0, every function f u 6 F u satisfies (21) with a = 0, and that if u ^ 0 then F u n F v contains 
only the zero function. In Subsection 5.1 we state an explicit formula for the anchored decomposition 
to demonstrate how (A5) holds. 

In Subsection 5.2, we define the setting for the case of a general anchored reproducing kernel 
and a general domain D. In Subsection 5.3 we specialize the integration domain to [—|, ),] and the 
kernel to one that leads to a subspace of the space described in Subsection 3.3, with a redefined 
(but equivalent) norm. In Subsection 5.4 we consider again the motivating example from (20). In 
Subsection 5.5 we specialize the quadrature formulas to lattice rules. Then, in Subsection 5.6 we 
specialize the quadrature formulas to Smolyak’s quadrature, and finally develop the algorithm A s . 

5.1 Anchored decomposition and (A5) 

Recall that (A5) is about the cost of evaluating individual terms f u from / = ]C| u | <0O fu- It is shown 
in [29] that, for the anchored decomposition with anchor at 0, the value of f u (x u ) can be expressed as 
a combination of at most 2l u l values of /, specifically 

/u(*u) = E(- 1 ) |uhH /(^;°)’ 

t)Cu 

where /(® B ; 0) indicates that we evaluate f(x) with Xj set to 0 for j ^ o. Here we assume as in, 
e.g., [28], that we can sample f(x) at some cost provided that x has only finitely many Xj different 
from 0. More precisely, we suppose that we can evaluate f(x u \ 0) at the cost $(|u|), where $ : 
{0,1, 2, (0, oo) is a given non-decreasing cost function. It follows that f u {x u ) can be obtained 

at a cost of 

M /i in 

-£(M) T « S 2M$(|u|). (30) 

fc =0 ^ ' 
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5.2 The anchored RKHS setting 

Let F = H(K ) be a reproducing kernel Hilbert space of univariate functions with the kernel K : 
D X D —> M. We assume that K has an anchor 0 € D, i.e., it (0,0) = 0. For g G F, it follows from 
the reproducing property g(x) = (g,K(x,-)) F that \g(0)\ < Hs-IIf ||/F(0, •)\\ F = \\g\\ F y/K(0, 0) = 0, 
implying g(Q) = 0 and, as a special case, K(x, 0) = 0 for all x € D. 

For nonempty u, the space F u is defined to be the reproducing kernel Hilbert space with kernel 

K u (x u ,y u ) = Y[K(xj,yj). (31) 

feu 

That is, F u = H(K U ) is the |u|-fold tensor product of the space F and consists of functions whose 
variables are those listed in u. 

In the space F u so defined, point evaluation is a continuous linear functional: indeed for x u G 
D u and g u G F u we have g u (x u ) = (g u , K u (x u , ■)) Fu , and hence \g u (x u )\ < j|g„||jr u ||AT u (*u, -)ll^u = 
| g u 11 Fu (K u (x u . a^u)) 1 ^ 2 , from which it is easily seen that the norm of the point evaluation functional is 

sup |<7u(*u)| = (K u (x u ,x u )) 1/2 = ( \\_K{xj,Xj) 

IIsu||f u <i V jgu 

We may now define T to be the class of functions 

/(*) = /«(*«) wi th / u G F u and ||/ u ||f u < B u , (32) 

|u|<oo 



and such that the above series is uniformly convergent for all x G D N . The class F can be relatively 
large when the kernel K is bounded. Suppose that ||A'||oo := sup xeD K(x,x) < oo. Then it follows 
from the reproducing property of K u and from (4) that the terms in the decomposition (1) satisfy 

E !/«(*«)I = E (fn,K u (x u ,-)) Fu < E II/uIIf u (^u (* u ,* u )) 1/2 < E l|TT||H /2 . 

|u|<oo |u|<oo |u|<oo | u| <oo 


Hence, if X]|u|<oo ll^ll&>^ 2 < 00 then the convergence in (1) is automatically uniform, and we may 
define T as the class of functions (32) without further restriction. On the other hand, if the kernel 
K is unbounded then ||if Hqo = oo, and the class F can be small. An example of such a situation is 
provided by 


D = M and K(x,y) 


x\ + \y\ -\x-y | 
2 


5.3 Specializing the kernel 

We now apply our results to a special case of the reproducing kernel Hilbert space setting. We let 

D= [~bW-‘ P( x ) = 1 ’ K ( x ,y) = ^ + ^ ( 33 ) 

and take F = H(K) to be the corresponding reproducing kernel Hilbert space. Since /\(0,0) = 0, 
this is clearly an anchored space with the anchor 0. Moreover, it can easily be verified that the 
corresponding norm is given by 

r 1/2 

h\\ 2 F = / \g’( x )\ 2 dx. 

J - 1/2 
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For nonempty u with |u| < oo, let F u = H(K U ) with kernel (31). Then the norm in the space F u is 
given by 


where d^/dx u 
we now have 


- J |u| 


C>l u l 


dx. 


9u(x u ] 


d* u for any g u 6 F u 


n ? eu(^/^ x i)- F° r a faction / £ J with anchored decomposition / = 5m u i <0O f u , 



|u| 


d\»\ 

dx u 


/(* u|0) 


2 

dx u 


II fu 


2 

F u 


(34) 


which follows from f(x u ; 0) = ^ oCu f 0 (x v ) together with d^f 0 /dx u = 0 if t> is a proper subset of u. 

1 /9 

For univariate integration 1(g) = J_' 1 , 2 g(x) dx with g E F = H(K), it is easy to verify that 
Co := \\I\\f = 12 -1 / 2 . Due to the tensor product structure of F u , we have from (17) that 

C u = ||/ u || = c ] 0 ul = 12-l u '/ 2 . (35) 


5.4 Examples of integration problems 


Consider again the example in (20), but now with the more convenient choice F u = H(K U ). For u C N, 
we then have 


fix u ;0) = 


1 


<9 |u| 

and -— f(x u ; 0 ) = 


(-l)H |u|! 


i + Eax/'— (n J 6 „i 2 )a + E J6 u.v/i 2 ) |u|+1 ’ 


(36) 


and hence from (34), together with Xj > —1/2 and l/j 2 = 7 t 2 /6, 

2 \ 1 M 

|u|! 

feu 

When combined with \\K ||oo = 1/2 and (35), this gives 


II/Jf. < (l-fi)‘ 1_W |u|! IfE 2 


|A/|N/ 2 B u = 2-H/ 2 (l-fj) 


— 1—|u| 


|u|! Hr 2 and C U B U = 12-H/ 2 (l - g 

feu 

||u|/2. 


— 1—|u| 


|u|! i[ l 2 - 

jeu 


It follows from the lemma below that X^|u|<oo ||A||Jx> B u < oo (and hence the convergence in (1) is 
uniform), and that oq in (A4) is given by ao = decay({ C u B u } u ) = 2. 


Lemma 10 Let b\ > 0. Suppose the sequence {gj}j> i with gj > 0 has 


decay ({<7j}j>i) 



62 > max(6i,0). 


Then 


decay 




62. 


Proof. This is a special case of [7, Theorem 5]. Here we provide a simple proof. For any r£ (61, b-f), 

00 00 / 00 \ £ 

E (M ! ) 6 i/r rU /T = E^ ! ) bl/T EIU 1/T ^ E^) 6 l ME^ /T ) < °°> 

|u| <00 j Su 1=0 \u\=£j£u £=0 '1=1 2 

where the first inequality follows from (Ej*Li a j) ^ ^ E|u|=.n ieu aj , and in the last expression the 
finiteness of the sum over j follows from r < 62, and the finiteness of the sum over i follows from the 
ratio test using r > b\. □ 
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This example motivates us to consider in the rest of this section the case in which (A2) holds with 

B u = (|u|!) bl fj, JJ(rej) 62 for some b 2 > max(61,0) and some fi, k>0. (37) 

jeu 

The lemma with (35) then gives 


a 0 = decay({C u -B u } u ) = b 2 . 

It is easy to verify the following proposition, see [34], 

Proposition 11 For C u = 12~l u l/ 2 and B u satisfying (37), forU(e,a) defined by (26) withe > 0 and 
« G (1,62), we have 

die) := max lul = O (- — tt-J—ttz I as e —>• 0. 
ue«(e,a) Vln(ln(l/e)) / 

If £(d) = e°A) as d —>• 00, then the cost of evaluating f u (x u ) for u G a) is 

£{d{e)) = (i/ £ )0(Vin(in(i/ £ ))) as e _^ 0 . 

We end this subsection with the following two remarks. 

Remark 12 Suppose that evaluation of f([x-,u\) incurs exponentially large cost $(|u|) = e°6 u l). Then 
it follows from (30) that the cost T(|u|) of obtaining f u (x u ) for u G 7/(e, a) is still of order e°6 u l) ? and 
hence, by Proposition 11, the cost is only 

(1 / £ )°( 1 / ln ( ln ( 1 / £ ))). 


Remark 13 Although Proposition 11 limits very efficiently the cardinality of the largest subset in the 
active set, Proposition 7 suggests that the cardinality of the active set itself is still polynomial in 1/e. 
In particular, the active set may contain {1},... ,{j} for a large value of j. For the example in (20) 
we can use the following argument to limit the size of the largest label j that needs to be considered. 
This is achieved by estimating the truncation error more accurately for the specific example, rather 
than by applying bounds on the worst case error. 

For this example, it follows from (20) and (36) together with Taylor’s theorem (expanding the 
univariate function 1 /(a + y) about y = 0, with a = 1 + J2j Gu x j/j 2 and y = J2j^ u x j/j 2 ) that 


f{x) - f(x u ; 0) = 


(1 + E 


jeu p 


12 


£3 + 


(1 + C (*, u )) 3 


j 2 

j^u 


for some ((x, u) G ( 


7T 2 7T 2 \ 

12 ’ 12 )■ 


Since the integral of the first term vanishes, we have 


£[/(•)-/(• u ;0)] < 


fi - — 
12 


4 llN 
2’2 J 


1 . - X 1 




-A, I dx 


1 


12(1 


FL 13 

12 > 


£ 


1 

j 4 ' 


In particular, if we choose u = {1 : £} := {1, 2,... then we have 


n/(-)-/(•{!:*};°)i ^ 


1 


0 - 3 


36(1- fo 


21113 
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We now take I = i e so that the right hand side is less than e/3, implying t = i e = f2(e 1 / 3 ). Then we 
replace the sum X]|t>|<oo(£» B 0 ) 1/,a in (26) by Y.x>e{i-.e}(C v B v ) 1/a , replace (23) by 

y! |iu(/u)| < g! 

UG{1:£}\W( £ ) 

and replace e/2 in (24) by e/ 3. T/ien MDM can be run as usual, and will still give an error bounded by 
e, but with the simplification that subsets containing numbers bigger than t e need never be considered. 
In effect, for this example the problem can be considered as an I e -dimensional problem, rather than as 
an infinite-dimensional one. 


5.5 Specializing the quadrature to lattice rules 

It can be shown by an adaptation of known results (see, e.g., [10, Theorem 5.9]) that in the case of the 
kernel in Subsection 5.3 we can construct shifted lattice rules with n points in |u| dimensions, where 
n > 3 is prime, such that (22) holds for all q £ [1/2,1), with 


Gu.n = 2 " 


+ 12 — 1 /( 2 ?) 

On 1 / /O 


(2tt 2 ) 1 /(2?) 


M? 


where ((x) = JJkL \k~ x is the Riemann zeta function. More precisely, the above result is adapted 
from known results for lattice rules by setting the weight j u for the particular u to be 1 and all other 
weights to be 0. We follow the analysis of Subsection 4.3, but instead of taking n u = [h u \ we take n u 
to be the largest prime number such that 3 < n u < h u , or set n u = 0 if this is not achievable. The 
remaining analysis in that subsection then applies. 

More precisely, a shifted lattice rule with n points for a d -variate function g defined over [—1/2, l/2] d 
takes the form 


1 

n 






where z £ Tfi is the generating vector and A £ [0, l] d is the shift. The braces around a vector indicate 
that we take the fractional part of each component in the vector, and the subtraction by 1/2 from 
all components takes care of the translation from the standard unit cube [0, l] rf to [—1/2,1/2]°/ A 
good generating vector for the lattice rule can be constructed using the fast component-by-component 
algorithm, see e.g., [33]. The shift can be generated randomly from the uniform distribution on 
[0, l} d (in this case the error bound holds in the root mean square sense), or the shift can be generated 
repeatedly until the desired error bound is achieved (in this case the error bound holds deterministically 
but the result is not fully constructive). See, e.g., [10] for details. 


5.6 Specializing the quadrature to Smolyak’s method 

We now apply Smolyak’s [39] quadrature scheme to the MDM in the RKHS context of this section, 
with kernel (33). Smolyak’s construction is often used for tensor-product problems. It is built from a 
single family of univariate quadrature rules, and for every space F u of a given dimensionality d = |u| we 
use the same family of rules. In the following we take the univariate quadrature rules to be trapezoidal 
rules since in this setting they achieve the optimal convergence rate of order 1. 

For d > 1, Smolyak’s construction for approximating a d -variate integral is given by the formula 

d 

Qd,n = 

ieN d , |i|<« 3 = 1 


16 



where [*[ = i\ +%2 + - • - + id, Uo is the zero algorithm, and each U t for i > 1 is a (composite) trapezoidal 
rule with 2* + 1 equally spaced points t^k = —1/2 + k/2\ k = 0 , ..., 2*. Actually, we only need 2* 
evaluations for U t since for every g e F the value < 7 ( 0 ) = 0 is for free. Note that Q i A = U K . In general, 
if k < d then Qd,n = 0. 

It is easy to verify that for univariate integration we have 

/ g(t)dt - Ui{g) = / g'(t) I<i(t)dt < \\g\\ F [ / Kf(t)dt 

J- 1/2 J- 1/2 \ J—1/2 

with + ti,fe+i)/2 — t if t G [ij^, Lg.+i)- Hence the worst case error of U{ is the L 2 norm of 

Ki, which is 

\\I-Ui\\ = —!= 2“* for * = 0,1,.... 

v 12 

From [46, Lemma 1] we know that can be written in an equivalent form as 



where P{d , k) = {i £ : k — d + 1 < |*| < k}. Note that this holds for general building blocks U t . 

From [46, Lemma 6] we then have the following proposition. 



Proposition 14 For d,K& N with k > d, the error of the trapezoidal-Smolyak algorithm is 
\\I{l:d } -QdJ < 2~ K ~ 1 3 -d / 2 


K ] < 2- K - x 3~ d / 2 
d — 1, 


-d -1 


(d— 1)! 


The following proposition provides bounds on the number n(d, k) of function evaluations used by 
the trapezoidal-Smolyak algorithm Qdn- 


Proposition 15 For d, k G N with k > d we have n( 1, k) = 2 K and 


2 «— d-\-l 


< n(d, k) < 2 K ~ d+1 e^ 2 " 1 


K d ~ l 


for d> 2. 


Proof. Clearly n( 1 ,k) = 2 K . Let d > 2. The lower bound on n(d,n ) is trivial. To obtain the upper 
bound, we count only those points used by Qd, K that correspond to i = (i\,... , id) G P{d, k) in (38) 
with |i| = k, but do not count those points with any component equal to 0. The number of points 
corresponding to such i with id = 1 is 2 n(d— 1, k — 1). For successive s = 2,3,».., k— d+l, the number 
of points corresponding to i with id = s that have not been counted yet is ( 2 s — 2 s-1 ) n(d — 1, k — s). 
This yields 

K—d +1 

n(d, k) = 2 n(d — 1, k — 1) + 2 s-1 n(d — 1, k — s), d > 2. 

s=2 


We now define 


b(d, k) 


n(d, k) 
2^— d+l ’ 


k > d > 1. 


Then we have 6(1, k) = 1 and 


Av— 1 

b(d, k) = b(d — 1, k — 1) + b(d — 1, s), d > 2. (39) 

s=d— 1 
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It remains to show that for d > 2 


b(d, re) < 


ed /2-l K d -1 

(d-iy. ' 


(40) 


Since 6(2, re) = re owing to (39), inequality (40) holds for d = 2. Suppose (40) holds for some d > 2. 
Using 6(d, re — 1) < 6(d, re) and the induction hypothesis, we obtain from (39) that 


K ,— 1 


d\ 


K pd/2—1 K p(d+l)/2—1 d 

6(d + 1, re) = 6(d,re-1) +< J^6(d,s) < l E^" 1 - ' 

s=d s=d s=d 

where we used the fact that x H > x d ~ 1 is a convex function so that 


yy- < r + 1 V> d, < <"+ 1/2) ' < m 
^ “ Jd- 1/2 “ rf " d 


s=d ^-V2 

and in the last inequality we used 1 + y < e y . Thus (40) follows by induction. 
We now turn to the construction of the algorithm 

Mf) = E V(/.u). 

ueW(e,a) 


□ 


(41) 


(The “bar” in our notation for the algorithm «4 e indicates that its error is slightly larger than e, as 
we show in Theorem 16 below.) Recall that the active set U(e,a) is given by (26). For the constant 
term /g = /(0,0,...), we define the corresponding algorithm to be the one-point rule 

A d>, n<6 {h) := h = f( 0,0,...), with n 0 := 1. (42) 

For each nonempty u G U(s,a), we recall that F u is equivalent to H(Kd) with d = |u| after an 
appropriate relabeling of the variables. Therefore we define 

A u,n u := Q|u|,Ku (^3) 

for some re u to be specified below. If re u < |u| then Q| u |,k u is the zero algorithm and n u = 0; otherwise 
n u = n(|u|, re u ) is the number of function evaluations used by the trapezoidal-Smolyak algorithm Q\ u \^ u , 
see Proposition 15. 

We now express the error \\I U — ^4 u ,n u || i R ^i ie f° rm (22), with q < 1 as is appropriate for the 
trapezoidal-Smolyak algorithm in this setting. Clearly the worst case error for u = 0 is zero and so 
G$ q = 0. For any nonempty u G U{e,a) with re u > |u|, we can use Propositions 14 and 15 to obtain 
an upper bound on ||/ u — AenJI A + l) 9 ) namely 

/ J u l~i 

n — 2 -M-l+(«u-|u|+l)g o-|u|/2 (|u|/2-l)g / Ku 

Gu ’ 9 - - d 6 \(|u| — 1)! 

When re u < |u| and so n u = 0, the above error bound holds with G U)q = 12 — l u l/ 2 . 

For each nonempty u G ld(e,a), let h v be given by (28) with q < 1 and G luq = 1, and dehne 

re u := |u| + |_log 2 h u \ , (44) 

so that 

2 Ku -l u l < K < 2 K “-I U I +1 . (45) 


\ 1/2 +q 
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Then for k u > |u| we have from Proposition 15 together with (45) that 


K < n u < 2/?, u e^/ 2 1 


(M-l)!' 


( 46 ) 


(Note that n u , the number of function evaluations used by 7l u ,n u > has the same meaning here as in 
Subsection 4.3, but its connection with h u here is different from that in Subsection 4.3.) Following (27) 
with G U) g = 1 and using the lower bound from (46), we obtain 


E \W) - A,n u (f)\ < 

u &A{e,a) 


max 
u £U(e,a) 


G 


u ,q 


E 

u £U(e,a) 


Bu 

hi 


The upper bound from (46) yields 


E «u-£(M) < 

u £U(e,a) 


max 2 e 

u£W(e,a) 


| u | /2—1 K u 


M-i 


(M-l)! 


u£U(e,a) 


K£{ M) 


(47) 


(48) 


From the derivation which leads to the definition of h u in (28), we conclude that the second factor 
on the right-hand side of (47) is e/2, while the second factor on the right-hand side of (48) can be 
bounded as in (29). This leads to the following theorem. 


Theorem 16 For the reproducing kernel Hilbert space setting specified by (33), for any e > 0, a € 
(l,«o) and q < 1, the algorithm A £ withU(e ) = U(e,a) defined by (26), Ai,n u defined by (42) and (43), 
and k u defined by (44), produces an approximation to the integral I with error 

e(A e \X) < eX(e,a,q), 


and cost 


where 


cost(„4, £ ) < 


1/9 


u£U(e,a) 


\ 1+1/9 
,l/(9+lA 

I maA 

J u eW(e,a) 


El Bu [q ^ ±> max T(|u|) Y(s, a), 


l u l 1 \ l/2+g 


X(e,a,q) = max 2 -l u l- 1 +(^-H+ 1 )9 3 -M / 2 glM/ 2 - 1 )'? 

u £U(e,a) V (l u l — !)! 7 


and 


l u l —1 

Y(e,a) = max 2 e ! u l/ 2 — 1 —E--. 

u eu(e,a) (|u| - 1)! 


Since B u of the form (37) implies |u| < d(e) = 0(ln(l/e)/ln(ln(l/e))) for u £ U(e,a), one can 
show using proof techniques similar to those from [34] that both X(e, a , q) and Y (e, a ) equal 


Corollary 17 Under the conditions of Theorem 16, for B u of the form (37) we have 

e(A £ ;X) < e 1 -^), 

where 5(e) = 0(l/ln(ln(l/e))). Moreover, if £(d) = e°^ then 

]\ 1/9+<H £ ) 


cost(^l £ ) < 2 1+1/,g 


E * 

u£U(e,a) 


\ 1+1/9 
1 /( 9 + 1 ) | 


19 



6 Second application: a non-Hilbert setting 

Next we consider an example which is in an anchored space setting, but not in a Hilbert space setting. 


6.1 Problem formulation 


Let D = M + = [0, oo) and let F be the space of (locally) absolutely continuous functions g : D —> M 
such that 

£f(0) = 0 and \\g\\ F := ||c/||oo < oo. 

The space F u is the completion of the |u|-fold algebraic tensor product of F whose functions depend 
only on variables listed in u. The completion is with respect to 


ll/ulk : = 


0W 


dx u ‘ 


Note that F u is not a Hilbert space; however it is anchored at 0. 

In the univariate case, for g 6 F and x G M + we can write g(x) = fg g (t) d t and therefore 
|g(x)| < x 11 g' | |oo, from which it follows easily that the functional for evaluation at the point x has the 
norm x. In a similar way it follows that the point evaluation functional for the finite subset u has the 
norm sup^u^! |s u (*u)| = Yljeu x r 

We are interested in approximating the weighted integral of / 6 F, where the weights are p(x ) = 
exp(— x) for the univariate case, and 


Pu(x u ) ■= JJexp(-a;j) 

ieu 



for the multivariate case. The class F can then be defined as the set of all uniformly convergent sums 
of functions f u € F u . To obtain the functional for integration, note first that for the univariate case, 
by integration by parts, 

poo poo 

Ha) ■= / f(x) exp(-z) dx = / f'(t) exp(-t) df, (49) 

Jo Jo 

hence the integration functional has the norm 1, leading to C u = ||/ u || = 1. 


6.2 Smolyak’s construction 

We approximate the univariate integral (49) by algorithms U t that are weighted versions of the (com¬ 
posite) trapezoidal rules using the points 

Xi k := —2 In (1- r —) , 0 < k < 2 i . 

Specifically, Uq = 0, and for % > 1 we have Ui(f) = Ylk=i a i,k9(xi,k ) with 

0 (-1 _ 0 %i,k 0 _ 0 3Ci,k — 1 

a,ik =-, 1 < k < 2 l — 1, and a j 2 < = 

%i,k %i,k %i,k— 1 

It was shown in [35] that 

\\I — Ui\\ < Ci 2~ % with C\ = 1.00656, and \\Ui~Ui-\ 


0C ■ rj 7 CC • i 

0 _ 0 i,2 L — 1 

x i,2 i ~ x i,2 i -l 


< 2 1 ~ i . 
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This perfectly fits the setting of [46, Lemmas 2 and 7]. It follows that for the corresponding Smolyak’s 
algorithm A a , Un = Q\ u \ jKu for |u|-variate integrals as in (43), n u > |u|, we have 


| In - A u , nu || < Cl 2-fo““l u l +1 ) (* u _ ^ 


and 


n u = n 


i(|u|,/eu) = 2 k “- |u|+1 ^ <2 ^-H+i^ 

Hence (22) holds for q < 1 with 

G U)q = Ci 2 («- 1 )fou-|u|+i) ^ J +y if Ku 
and G U)q = 1 if k u < |u| (in which case n u = 0 and A u .q = 0 ). 


« u 

u| - 1 


- 1. 


> M> 


6.3 Specializing MDM 

Taking C u = 1 in (A4) and (26), and proceeding as in Subsection 5.6 we obtain a result corresponding 
to Corollary 17. 

Corollary 18 In the setting of this section, we use the algorithm A e defined by (41) with q < 1 and 
h u and n u defined by (28) and (44). For B u of the form (37) with 62 > max(&i, 1) we have 

e(A £ -,F) < e 

where 5(e) = 0(l/ln(ln(l/e))). Moreover, if £(d) = e 0 ^ then 

cost(A e ) < O (e-( 1+,5 ( £ ))) . 
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